
Proposal for Caching the Within-Preconditioner #1264

Open

s3alfisc wants to merge 14 commits into master from cache-within

Conversation

@s3alfisc
Member

s3alfisc commented Mar 28, 2026

Summary

This PR introduces a typed fixed-effects demeaning API, passes it through the code base, and adds reusable within preconditioners to speed up repeated multi-way FE estimation.

The main user-facing additions are:

  • We introduce typed demeaner= objects:
    • MapDemeaner(...)
    • WithinDemeaner(...)
    • LsmrDemeaner(...)
  • demeaner now takes precedence over demeaner_backend, fixef_tol, and fixef_maxiter
  • feols now exposes fit.preconditioner_ for WithinDemeaner
  • users can reuse that object via WithinDemeaner(preconditioner=...)
  • WithinPreconditioner can be pickled across sessions, following the upstream within-py pattern

What Changed

1. Typed demeaner API

We now support a typed demeaner= interface instead of relying only on string backend selection via demeaner_backend. This makes the demeaner backend configuration explicit, extensible, and easier to validate.

The typed demeaner objects control actual runtime execution:

  • WithinDemeaner options flow into the Rust within backend
  • LsmrDemeaner options flow into the SciPy/CuPy LSMR solvers
  • MapDemeaner options flow into the MAP solvers
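To illustrate the shape of the API, here is a minimal self-contained sketch of the typed configs and of the precedence rule ("demeaner wins over demeaner_backend"). Class and field names mirror those visible in this PR; the defaults and the `resolve_demeaner` helper are illustrative stand-ins, not the actual implementation.

```python
from dataclasses import dataclass

# Stand-ins for the typed demeaner configs. Defaults are illustrative.
@dataclass(frozen=True)
class BaseDemeaner:
    fixef_tol: float = 1e-8
    fixef_maxiter: int = 100_000

@dataclass(frozen=True)
class MapDemeaner(BaseDemeaner):
    backend: str = "numba"  # "numba" | "rust" | "jax"

@dataclass(frozen=True)
class WithinDemeaner(BaseDemeaner):
    krylov_method: str = "cg"
    preconditioner_type: str = "additive"

@dataclass(frozen=True)
class LsmrDemeaner(BaseDemeaner):
    precision: str = "float64"

def resolve_demeaner(demeaner=None, demeaner_backend=None):
    """A typed `demeaner` object takes precedence over the
    legacy string `demeaner_backend` selection."""
    if demeaner is not None:
        return demeaner
    mapping = {"within": WithinDemeaner, "lsmr": LsmrDemeaner}
    return mapping.get(demeaner_backend, MapDemeaner)()
```

Because the configs are frozen dataclasses, a resolved demeaner can be validated once and then passed around without defensive copying.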

2. Reusable within preconditioners

For multi-way FE WithinDemeaner fits, we now build and cache a reusable WithinPreconditioner.

That preconditioner is:

  • exposed on feols as fit.preconditioner_
  • reusable across later estimations via WithinDemeaner(preconditioner=...)
  • reused internally across pyfixest's multiple estimation syntax
  • reused internally across IWLS iterations in feglm and fepois

3. IWLS reuse

For feglm and fepois, we now reuse a fit-local within preconditioner across IWLS iterations.

Current policy is intentionally simple:

  • build once
  • reuse by default
  • refresh once when inner FE tolerance tightens
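The three bullets above can be sketched as a small cache. This is an illustrative stand-in, not the PR's actual code; the class and callable names are hypothetical.

```python
class IwlsPreconditionerCache:
    """Build once, reuse by default, refresh at most once when the
    inner fixed-effects tolerance tightens."""

    def __init__(self, build):
        self._build = build          # callable: tol -> preconditioner
        self._precond = None
        self._built_at_tol = None
        self._refreshed = False

    def get(self, tol):
        if self._precond is None:
            # first IWLS iteration: build once
            self._precond = self._build(tol)
            self._built_at_tol = tol
        elif tol < self._built_at_tol and not self._refreshed:
            # tolerance tightened: refresh once, then keep reusing
            self._precond = self._build(tol)
            self._built_at_tol = tol
            self._refreshed = True
        return self._precond
```

The point of the one-shot refresh is to bound the rebuild cost: no matter how many IWLS iterations run, the preconditioner is constructed at most twice.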

@schroedk - For generalized linear models, we need to demean repeatedly with different weights. The current policy is that it is better to run on a "stale" preconditioner for a while than to rebuild the preconditioner in every iteration. This can of course be refined later.

4. Persistence

WithinPreconditioner is pickleable across sessions.

This mirrors within-py closely:

  • the Rust FePreconditioner is serialized with postcard
  • Python pickle support is implemented through __reduce__
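A pure-Python sketch of this pattern, under the assumption described above: the Rust side hands Python an opaque serialized byte payload (via postcard in the real implementation; faked with a plain byte string here), and `__reduce__` arranges for pickle to round-trip only that payload. All names below are illustrative stand-ins.

```python
import pickle

class WithinPreconditioner:
    def __init__(self, payload: bytes):
        # opaque serialized solver state (postcard bytes upstream)
        self._payload = payload

    def __reduce__(self):
        # pickle will call _rebuild_preconditioner(payload) on load,
        # so only the byte payload crosses the session boundary
        return (_rebuild_preconditioner, (self._payload,))

def _rebuild_preconditioner(payload: bytes) -> "WithinPreconditioner":
    """Module-level rebuild hook so the callable itself is picklable."""
    return WithinPreconditioner(payload)
```

Using a module-level rebuild function (rather than a bound method) keeps the reduce callable trivially picklable across Python versions.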

Key Design Choices

Why expose preconditioners, but not solvers?

We intentionally expose preconditioners as reusable objects, but not solvers, because reusing a within preconditioner is where most of the practical speedup comes from. It is fine if the preconditioner becomes a bit stale - this is a feature, not a bug imo. It also spares us from checking that the fixed effects, the sample, and the weights are identical across fits, which is what we would have needed to do if we allowed users to pass solvers.

For iterated weighted least squares as used for GLMs, the preconditioner is the better choice: in each iteration, a slightly stale preconditioner will be "good enough".

Stay close to within-py

We tried to stay close to the implementation in within-py.

In particular:

  • preconditioners are built through the same conceptual flow as upstream
  • reused preconditioners are passed back into the solver through the same upstream abstractions
  • persistence uses the same postcard + pickle pattern as within-py

Workflow

Typical usage now looks like:

fit1 = pf.feols(
    "Y ~ X1 | f1 + f2",
    data=data,
    demeaner=pf.WithinDemeaner(),
)

# use the cached preconditioner for the next fit
fit2 = pf.feols(
    "Y ~ X2 | f1 + f2",
    data=data,
    demeaner=pf.WithinDemeaner(preconditioner=fit1.preconditioner_),
)

For persistent reuse, you can do:

import pickle

payload = pickle.dumps(fit1.preconditioner_)
restored = pickle.loads(payload)

The restored object can then be passed into a new fit via WithinDemeaner(preconditioner=restored), even in a fresh session.

@codecov

codecov Bot commented Mar 28, 2026

Codecov Report

❌ Patch coverage is 69.91150% with 136 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
pyfixest/core/demean.py 45.97% 47 Missing ⚠️
pyfixest/demeaners.py 67.22% 39 Missing ⚠️
pyfixest/estimation/internals/demeaner_options.py 62.68% 25 Missing ⚠️
pyfixest/estimation/internals/demean_.py 73.41% 21 Missing ⚠️
pyfixest/core/collinear.py 87.50% 1 Missing ⚠️
pyfixest/estimation/cupy/demean_cupy_.py 80.00% 1 Missing ⚠️
pyfixest/estimation/models/feglm_.py 94.44% 1 Missing ⚠️
pyfixest/estimation/models/fepois_.py 94.44% 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (c533be2) and HEAD (dbb75cf).

HEAD has 3 uploads less than BASE
Flag BASE (c533be2) HEAD (dbb75cf)
tests-extended 1 0
test-r-fixest 1 0
test-r-core 1 0
Flag Coverage Δ
core-tests 72.86% <69.84%> (-0.22%) ⬇️
test-r-core ?
test-r-extended 19.57% <37.16%> (+0.99%) ⬆️
test-r-fixest ?
tests-extended ?

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
pyfixest/__init__.py 81.81% <100.00%> (ø)
pyfixest/estimation/FixestMulti_.py 78.12% <100.00%> (-3.99%) ⬇️
pyfixest/estimation/api/feglm.py 91.42% <100.00%> (+1.42%) ⬆️
pyfixest/estimation/api/feols.py 100.00% <100.00%> (ø)
pyfixest/estimation/api/fepois.py 96.77% <100.00%> (+0.62%) ⬆️
pyfixest/estimation/internals/literals.py 87.50% <ø> (ø)
pyfixest/estimation/models/fegaussian_.py 87.50% <100.00%> (+0.40%) ⬆️
pyfixest/estimation/models/feiv_.py 84.54% <100.00%> (-2.73%) ⬇️
pyfixest/estimation/models/felogit_.py 88.88% <100.00%> (+0.31%) ⬆️
pyfixest/estimation/models/feols_.py 84.72% <100.00%> (-7.71%) ⬇️
... and 12 more

... and 14 files with indirect coverage changes


Collaborator

@leostimpfle leostimpfle left a comment

This looks very nice! After an initial look (mostly at the design), I have three (fairly minor) comments:

  • I would favour more consistent use of typing.Literal to avoid passing str arguments (e.g., preconditioner_type)
  • Many of the object.__setattr__ inside the frozen dataclasses can probably be avoided so that __post_init__ just calls validations (but does not modify provided arguments)
  • I'm wondering if adding a demean method to BaseDemeaner would be nicer than the explicit dispatch_demean

Comment thread pyfixest/core/demean.py
n_obs: int,
n_factors: int,
factor_cardinalities: tuple[int, ...],
preconditioner_type: str,

Can we use a typing.Literal for clarity here and anywhere preconditioner_type is used? For example, PreconditionerType = Literal["additive", "multiplicative"]
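A sketch of how this suggestion could look: a `Literal` alias lets the type checker reject bad strings at call sites, while `get_args()` provides a single source of truth for runtime validation of untyped callers. The `PreconditionerConfig` class here is a hypothetical stand-in, not code from the PR.

```python
from dataclasses import dataclass
from typing import Literal, get_args

PreconditionerType = Literal["additive", "multiplicative"]

@dataclass(frozen=True)
class PreconditionerConfig:
    preconditioner_type: PreconditionerType = "additive"

    def __post_init__(self):
        # runtime guard for untyped callers; mypy catches typed ones.
        # Note this validates without object.__setattr__ reassignment.
        if self.preconditioner_type not in get_args(PreconditionerType):
            raise ValueError(
                f"preconditioner_type must be one of "
                f"{get_args(PreconditionerType)}"
            )
```

This also makes the argument case sensitive by construction, which would remove the need for `.lower()` normalization elsewhere.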

Comment thread pyfixest/demeaners.py
class WithinDemeaner(BaseDemeaner):
"""Krylov-based demeaner configuration for the Rust `within` backend."""

krylov_method: str = "cg"

Also use a typing.Literal, e.g., Literal["cg", "gmres"]?

Comment thread pyfixest/demeaners.py
class LsmrDemeaner(BaseDemeaner):
"""Sparse LSMR demeaner for CPU and GPU backends."""

precision: str = "float64"

Use Literal["float32", "float64"]?

Comment thread pyfixest/core/demean.py
krylov_method: str,
preconditioner_type: str,
) -> tuple[str, str]:
krylov_method = krylov_method.lower()

Do we need this kind of string parsing or should we tighten the arguments to a Literal (see other comments)?

maxiter: int = 1_000,
krylov_method: str = "cg",
gmres_restart: int = 30,
preconditioner_type: str = "additive",

Tighten to Literal["additive", "multiplicative"]?

Comment thread pyfixest/demeaners.py
Comment on lines +37 to +46
object.__setattr__(
self,
"fixef_tol",
_validate_positive_float(self.fixef_tol, "fixef_tol"),
)
object.__setattr__(
self,
"fixef_maxiter",
_validate_positive_int(self.fixef_maxiter, "fixef_maxiter"),
)

Why not just call _validate_positive_float and _validate_positive_int instead of reassigning the attributes?

Comment thread pyfixest/demeaners.py
backend = self.backend.lower()
if backend not in {"numba", "rust", "jax"}:
raise ValueError("`backend` must be one of 'numba', 'rust', or 'jax'.")
object.__setattr__(self, "backend", backend)

We can avoid the reassign when tightening backend to be case sensitive (consistent with the type hint)?

)
from pyfixest.estimation.internals.literals import DemeanerBackendOptions

ResolvedDemeaner: TypeAlias = AnyDemeaner

Why is this introduced? Couldn't we just type hint with AnyDemeaner?

return replace(demeaner, fixef_tol=tol)


def dispatch_demean(

How about giving demeaner a method demean that avoids this dispatch function?
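For concreteness, a minimal sketch of this suggestion with stand-in classes: each typed demeaner implements `demean()` itself, so call sites invoke `demeaner.demean(...)` polymorphically instead of routing through an explicit dispatch function. The backend calls are placeholders, not pyfixest internals.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BaseDemeaner:
    def demean(self, x, flist, weights):
        raise NotImplementedError

@dataclass(frozen=True)
class MapDemeaner(BaseDemeaner):
    def demean(self, x, flist, weights):
        return ("map", x)     # would call the MAP backend here

@dataclass(frozen=True)
class WithinDemeaner(BaseDemeaner):
    def demean(self, x, flist, weights):
        return ("within", x)  # would call the Rust within backend here

# call sites no longer need isinstance-based dispatch:
result = WithinDemeaner().demean(x=[1.0], flist=None, weights=None)
```

The trade-off is that backend-specific imports move into the demeaner classes themselves, which may or may not be desirable for keeping the config objects lightweight.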
